Hypothesis tests and Confidence intervals
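It may be impossible to discover the true value of \theta , so some estimate of the parameter is required. A general estimate is denoted \tilde{\theta} and the maximum likelihood estimate (MLE) is denoted \hat{\theta} . Confidence intervals provide a way to quantify and summarize the uncertainty in such an estimate.

The usual convention of writing X for a random variable and x for its realization does not carry over neatly to estimators, since the same symbol \hat{\theta} is used both for the random quantity and for its realized value. When the MLE is used, the estimate \hat{\theta} tends to the true \theta as the sample size grows.

There is a common theory underlying hypothesis tests and confidence intervals.

Hypothesis testing

Hypotheses

Scientific questions are typically framed as hypotheses, that is, statements about the world. In statistical terms, a hypothesis is a statement about the particular value that a parameter \theta might take within the parameter space \Omega , for example

H:\theta = 0

More generally, a hypothesis states that \theta lies in some subset \omega of \Omega :

H:\theta \in \omega

Null and alternative hypothesis

The null hypothesis is denoted with a zero subscript, for example H_0:\theta = 0 or, more generally, H_0:\theta \in \omega . The alternative hypothesis is often denoted with a 1 subscript and covers the rest of the parameter space:

H_1:\theta \in \Omega - \omega

which can equally be written H_1:\theta \notin \omega .

Rejection region

The rejection region R is the set of samples x for which the null hypothesis is rejected. It is a subset of the sample space, R \subseteq \mathbb{R}^n , and is usually defined by comparing a test statistic t(x) with a critical value c :

R = \{ x:t(x) > c \}

Steps in testing

* define a null hypothesis
* collect data x_1, x_2, ..., x_n
* define a rejection region, usually by comparing a test statistic t(x) with a critical value c
* reject the null hypothesis if the observed data fall in the rejection region

Composite Hypotheses

A simple hypothesis specifies a single value of \theta . A composite hypothesis allows more than one value, as happens for example when there are multiple parameters and some of them are left unspecified.

Fixed level testing

An approach to hypothesis testing in which the significance level \alpha (the size of the test) is fixed in advance and the critical value c is chosen to match it.

p-value

The p-value is the probability, under the null hypothesis, of a test statistic at least as extreme as the one observed.

Likelihood Ratio

The likelihood ratio provides a measure with which to evaluate the relative strength of evidence for two simple hypotheses.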
If X \sim f(x|\theta) and L(\theta)=L(\theta|x_1,x_2,...,x_n) is the likelihood, then the likelihood ratio of \theta_1 against \theta_0 is

LR = \frac{ L(\theta_1) }{ L(\theta_0) }

and, writing l(\theta) = \log L(\theta) for the log-likelihood,

\log( LR ) = \log \left( \frac{ L(\theta_1) }{ L(\theta_0) } \right) = l(\theta_1)-l(\theta_0)

Neyman-Pearson Lemma

The Neyman-Pearson lemma states that, when testing between two simple hypotheses H_0:\theta=\theta_0 and H_1:\theta=\theta_1 , the likelihood ratio test is the most powerful test of size \alpha , for any value of \alpha .

Generalized likelihood ratio test

The generalized likelihood ratio test is the likelihood ratio test applied to composite hypotheses. Because composite hypotheses do not determine exact values of \theta , the likelihood under each hypothesis is maximized over the parameter values that the hypothesis allows.

Free parameters

The number of parameters left unspecified under a given hypothesis is called the number of free parameters.

Other approximate statistics

The likelihood ratio test can always be used to construct a confidence region. There are also various approximations to the likelihood ratio test which are asymptotically equivalent to it as the sample size grows large. These alternatives are sometimes easier to apply.

Wald tests

"The Wald tests are derived by expanding the log-likelihood at its null value \theta_0 , i.e. l(\theta_0) , around its value at the MLE \hat{\theta} using Taylor series" (M347).

Type 1 approximates 2 \log( LR ) using the expected information evaluated at the MLE:

W_1 = (\theta_0 - \hat{\theta})^2 E\{-l''(\theta)\}|_{\hat{\theta}}

Type 2 evaluates the expected information at \theta_0 instead:

W_2 = (\theta_0 - \hat{\theta})^2 E\{-l''(\theta)\}|_{\theta_0}

Score Tests

The first derivative of the log-likelihood, l'(\theta) , is known as the score function. By the definition of the MLE, the score function takes the value 0 at \hat{\theta} . The score statistic is

S = \frac{ \{l'(\theta_0)\}^2 }{ E\{-l''(\theta)\}|_{\theta_0} }

If the null hypothesis is true, then S approximates 2 \log( LR ) in large samples, and its asymptotic null distribution is \chi^2(1) .

Confidence Intervals

A confidence interval, also known as an interval estimator, represents a plausible range of values for the parameter being estimated. If the variance of the estimator is large, then the interval will be wide.
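
As an illustration of the statistics above, the following Python sketch evaluates 2 log(LR), W_1, W_2 and S for a single-parameter Poisson model. The Poisson model is an assumption made here for concreteness and is not taken from the text above. The data, the null value theta0 = 3 and the function names poisson_test_statistics, chi2_1_pvalue and wald_interval are likewise hypothetical. The likelihood ratio is taken between \hat{\theta} and \theta_0 , and the interval shown is a Wald-type approximate 95% interval based on the expected information, which is one common construction among several.

```python
import math


def poisson_test_statistics(x, theta0):
    """Statistics for H_0: theta = theta0 with i.i.d. Poisson(theta) counts.

    Assumed model for illustration; requires at least one nonzero count
    so that the MLE is positive.
    """
    n = len(x)
    total = sum(x)
    theta_hat = total / n  # the MLE of a Poisson mean is the sample mean

    def loglik(theta):
        # log-likelihood l(theta), dropping the term that does not depend on theta
        return total * math.log(theta) - n * theta

    def score(theta):
        # score function l'(theta)
        return total / theta - n

    def expected_info(theta):
        # expected information E{-l''(theta)} = n / theta for this model
        return n / theta

    return {
        "theta_hat": theta_hat,
        "two_log_lr": 2.0 * (loglik(theta_hat) - loglik(theta0)),
        "w1": (theta0 - theta_hat) ** 2 * expected_info(theta_hat),
        "w2": (theta0 - theta_hat) ** 2 * expected_info(theta0),
        "score": score(theta0) ** 2 / expected_info(theta0),
        "info_at_mle": expected_info(theta_hat),
    }


def chi2_1_pvalue(stat):
    """Upper tail probability of a chi-squared(1) variable, P(chi2_1 > stat)."""
    return math.erfc(math.sqrt(stat / 2.0))


def wald_interval(theta_hat, info, z=1.959964):
    """Approximate 95% interval theta_hat +/- z / sqrt(info)."""
    half_width = z / math.sqrt(info)
    return theta_hat - half_width, theta_hat + half_width


counts = [3, 5, 2, 4, 6, 3, 4, 5, 2, 6]  # hypothetical data
stats = poisson_test_statistics(counts, theta0=3.0)
critical_value = 3.841  # approximate upper 5% point of chi-squared(1)

for name in ("two_log_lr", "w1", "w2", "score"):
    value = stats[name]
    decision = "reject H_0" if value > critical_value else "do not reject H_0"
    print(f"{name}: {value:.3f}, p = {chi2_1_pvalue(value):.3f}, {decision}")

low, high = wald_interval(stats["theta_hat"], stats["info_at_mle"])
print(f"approximate 95% interval for theta: ({low:.2f}, {high:.2f})")
```

For the Poisson model S and W_2 coincide algebraically, so the sketch produces three distinct values rather than four. A smaller expected information, that is, a larger variance of \hat{\theta} , gives a wider interval.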